
    Studio report: sound synthesis with DDSP and network bending techniques

    This paper reports on our experiences synthesizing sounds and building network bending functionality onto the Differentiable Digital Signal Processing (DDSP) system. DDSP is an extension to the TensorFlow API with which we can embed trainable signal processing nodes in neural networks. Comparing DDSP sound synthesis networks to preset-finding networks and sample-level synthesis networks, we argue that DDSP offers a third mode of working, providing continuous real-time control of high-fidelity synthesizers using a small number of control parameters. We describe two phases of our experimentation. First, we worked with a composer to explore different training datasets and parameters. Second, we extended DDSP models with network bending functionality, which allows us to feed additional control data into the network's hidden layers and achieve new timbral effects. We describe several possible network bending techniques and how they affect the sound.
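    The core move in network bending can be illustrated without the DDSP codebase: apply a transformation to a hidden layer's activations during the forward pass, so downstream layers see the altered values. The following is a minimal NumPy sketch; the toy two-layer network and the scale/threshold transformation are illustrative assumptions, not the system described in the paper.

```python
import numpy as np

def bend(activations, scale=2.0, threshold=0.0):
    """Scale activations above a threshold and zero the rest: one
    simple transformation in the spirit of network bending (illustrative)."""
    return np.where(activations > threshold, activations * scale, 0.0)

# Toy two-layer network; in a DDSP model the layers would drive a synthesizer.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 2))
x = rng.normal(size=(1, 4))

hidden = np.tanh(x @ W1)        # ordinary forward pass
bent = bend(hidden, scale=1.5)  # inject the control transformation mid-network
y = bent @ W2                   # downstream layers see the bent activations
print(y.shape)  # (1, 2)
```

    Exposing parameters like `scale` and `threshold` as live controls is what turns a transformation of this kind into a performable timbral effect.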

    Friend Me Your Ears: A Musical Approach to Human-Robot Relationships.

    PhD thesis. A relationship is something that is necessarily built up over time; however, Human-Robot Interaction (HRI) trials are rarely extended beyond a single session. These studies are insufficient for examining multi-interaction scenarios, which will become commonplace if the robot is situated in a workplace or adopts a role that is part of a human's routine. Long-term studies that have been executed often demonstrate a declining novelty effect. Music, however, provides an opportunity for affective engagement, shared creativity, and social activity. That said, it is unlikely that a robot best equipped to build sustainable and meaningful relationships with humans will be one that can solely play music. In their day-to-day lives, most humans encounter machines and computer programs capable of executing impressively complex tasks to a high standard that may provide them with hours of engagement. In order to have anything that could be classed as a social relationship, the human must have the sense that their interactions are taking place with another, a phenomenon known as social presence. In this thesis, we examine whether the addition of simulated social behaviours will improve a sense of believability or social presence, which, along with an engaging musical interaction, will allow us to move towards something that could be called a human-robot relationship. First, we conducted a large online survey to gain insight into relationships based in regular music activity. Using these results, we designed, constructed and programmed Mortimer, a robotic system capable of playing the drums, and a responsive composition algorithm to best meet these aims. This robot was then used in a series of studies, one single-session and two long-term, testing various simulated social behaviours to complement the musical improvisation. These experiments and their results address the paucity of long-term studies both specifically in Social Robotics and in the broader HRI field, and provide a promising insight into a possible solution to generally poor outcomes in this area. This conclusion is based upon the model of a positive human-robot relationship and the methodological approach of automated behavioural metrics to evaluate robotic systems in this regard, developed and detailed within the thesis. Funded by the EPSRC as part of the Media and Arts Technology Doctoral Training Centre, EP/G03723X/2.

    Creating Latent Spaces for Modern Music Genre Rhythms Using Minimal Training Data

    In this paper we present R-VAE, a system designed for the exploration of latent spaces of musical rhythms. Unlike most previous work in rhythm modeling, R-VAE can be trained with small datasets, enabling rapid customization and exploration by individual users. R-VAE employs a data representation that encodes simple and compound meter rhythms. To the best of our knowledge, this is the first time that a network architecture has been used to encode rhythms with these characteristics, which are common in some modern popular music genres.

    Supporting Feature Engineering in End-User Machine Learning

    A truly human-centred approach to Machine Learning (ML) must consider how to support people modelling phenomena beyond those receiving the bulk of industry and academic attention, including phenomena relevant only to niche communities and for which large datasets may never exist. While deep feature learning is often viewed as a panacea that obviates the task of feature engineering, it may be insufficient to support users with small datasets, novel data sources, and unusual learning problems. We argue that it is therefore necessary to investigate how to support users who are not ML experts in deriving suitable feature representations for new ML problems. We also report on the results of a preliminary study comparing user-driven and automated feature engineering approaches in a sensor-based gesture recognition task.
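    As a concrete illustration of the kind of feature engineering at stake, a small set of hand-crafted features can be computed from a raw sensor window before any classifier sees it. The feature choices below (mean, standard deviation, range, zero crossings) are generic examples of user-driven feature design, not the features used in the study.

```python
import math

def hand_features(window):
    """Hand-engineered features for a 1-D sensor window (illustrative):
    mean, standard deviation, range, and zero-crossing count."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    # Count sign changes between consecutive samples.
    zero_crossings = sum(
        1 for a, b in zip(window, window[1:]) if (a < 0) != (b < 0)
    )
    return {
        "mean": mean,
        "std": std,
        "range": max(window) - min(window),
        "zero_crossings": zero_crossings,
    }

print(hand_features([0.1, -0.2, 0.3, -0.1]))
```

    An end user choosing which of these summaries to feed a gesture recognizer is performing exactly the feature-engineering task the paper argues needs support.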

    The challenge of feature engineering in programming for moving bodies

    The design of bespoke human movement analysis and control systems by end users and other people without programming or signal processing expertise presents great opportunities for the arts, accessible interface design, games, and other domains. In this paper, we describe the challenge of feature engineering that confronts many people wishing to build such systems. We have conducted three studies exploring approaches to supporting feature engineering and investigated how such approaches may impact on system accuracy, user experience, and design outcomes. We briefly outline study outcomes that are most relevant to the workshop themes.

    Network Bending Neural Vocoders

    Network bending [1] aims to elicit interesting creative output from generative neural networks by applying various transformations to the activations of groups of network nodes. This paper describes an investigation of how this emerging technique can be used to provide novel creative control over sound synthesis networks based on the Magenta DDSP API [2], and how best to give creative practitioners access to the resulting sound synthesis neural networks.

    Examining Student Coding Behaviours in Creative Computing Lessons using Abstract Syntax Trees and Vocabulary Analysis

    Creative computing is an approach to computing education which emphasises the creation of interactive audiovisual software and an art-school influenced pedagogy. Given this emphasis on Dewey's “learning by doing”, we set out to investigate the processes students use to develop their programs. We refer to these processes as the students' ‘coding behaviour’, and we expect that understanding it will provide us with valuable information about how students learn in our creative computing classes. As existing metrics were not sufficient, we introduce a new set of quantitative metrics to describe coding behaviours. The metrics consider factors such as students' vocabulary use and development, how fast and how much they alter the functionality of code over time, and how they iterate on their code through text insert and delete operations. Many of our lessons involve providing students with demonstrator code which they use as a base for the development of their programs, so we use demo code as an entry point to our dataset. We look at programs students have written through developing the demo code in a dataset of over 16,000 programs. We clustered the demo code using the set of descriptive metrics. This led to a set of clusters containing programs which are associated with distinct coding behaviours. Four was the ideal number of clusters for cluster density and separation. We found that the clusters had distinct behaviour patterns, that they were associated with different instructors, and that they contained demo programs with different lengths.
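    The vocabulary side of such metrics can be sketched with Python's standard ast module: parse a program and count the identifiers it uses. The demo string and the restriction to `ast.Name` nodes are simplifying assumptions; the metrics described in the paper are richer than this.

```python
import ast
from collections import Counter

def code_vocabulary(source):
    """Count identifier occurrences in a program via its abstract
    syntax tree: a minimal sketch of vocabulary analysis."""
    tree = ast.parse(source)
    return Counter(
        node.id for node in ast.walk(tree) if isinstance(node, ast.Name)
    )

# Hypothetical sketch-style demo code a student might extend.
demo = "x = size / 2\nellipse(x, x, size, size)\n"
print(code_vocabulary(demo))
```

    Tracking how such a vocabulary grows and changes across saved versions of a student's program is one way to quantify coding behaviour over time.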

    R-VAE: Live latent space drum rhythm generation from minimal-size datasets

    In this article, we present R-VAE, a system designed for the modeling and exploration of latent spaces learned from rhythms encoded in MIDI clips. The system is based on a variational autoencoder neural network, uses a data structure that is capable of encoding rhythms in simple and compound meter, and can learn models from little training data. To facilitate the exploration of models, we implemented a visualizer that relies on the dynamic nature of the pulsing rhythmic patterns. To test our system in real-life musical practice, we collected small-scale datasets of contemporary music genre rhythms and trained models with them. We found that the non-linearities of the learned latent spaces, coupled with tactile interfaces for interacting with the models, were very expressive and led to unexpected places in musical composition and live performance settings. A music album was recorded and premiered at a major music festival using the VAE latent space on stage.
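    Latent-space exploration of the kind described here typically means decoding points along a path between latent coordinates. The sketch below shows only the interpolation step; the latent coordinates and the decoder that would turn each point into a drum pattern are assumptions standing in for the trained R-VAE model.

```python
import numpy as np

def interpolate(z_a, z_b, steps=5):
    """Linear path between two latent points; each point would be
    decoded into a rhythm by the trained VAE decoder (not shown)."""
    ts = np.linspace(0.0, 1.0, steps)
    return [(1 - t) * z_a + t * z_b for t in ts]

z_a = np.array([0.0, 1.0])  # hypothetical latent coordinates
z_b = np.array([1.0, 0.0])
path = interpolate(z_a, z_b, steps=3)
print(path[1])  # midpoint of the traversal: [0.5 0.5]
```

    Mapping a tactile controller onto such a traversal is one way to obtain the kind of live, non-linear exploration the article reports.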

    Generation and visualization of rhythmic latent spaces

    In this paper we extend R-VAE, a system designed for the modeling and exploration of latent spaces of musical rhythms. R-VAE employs a data representation that encodes simple and compound meter rhythms, common in some contemporary popular music genres. It can be trained with small datasets, enabling rapid customization and exploration by individual users. To facilitate the exploration of the latent space, we provide R-VAE with a web-based visualizer designed for the dynamic representation of rhythmic latent spaces. To the best of our knowledge, this is the first time that a dynamic visualization has been implemented to observe a latent space learned from rhythmic patterns.

    Contemporary Machine Learning for Audio and Music Generation on the Web: Current Challenges and Potential Solutions

    We evaluate specific Web-based technologies that can be used to implement complex contemporary Machine Learning systems for Computer Music research, in particular for the problem of audio signal generation. As a result of greater investment from large corporations including Google and Facebook in areas such as the development of Web-based, accelerated, cross-platform Machine Learning libraries, alongside greater interest and engagement from the academic community in exploring such approaches, Machine Learning is becoming much more prevalent on the Web. This could have great potential impact for Computer Music research, acting to democratise access to complex, accelerated Machine Learning technologies through increased usability and flexibility, in tandem with clear documentation and examples. However, some problems remain in relation to the creation of more complete Machine Learning pipelines for Music and Sound generation. We discuss some key potential challenges in this area, and attempt to evaluate some relevant solutions for developing more accessible Computer Music Machine Learning systems.